Chapter 4
Counting on Statistical Software
IN THIS CHAPTER
Examining the evolution of statistical software
Surveying commercial, open source, and free options
Considering code-based versus non–code-based software
Storing data in the cloud
Before statistical software existed, complex regressions that were possible in theory were too
laborious to carry out by hand on real datasets. It wasn’t until the 1960s, with the development of the
SAS suite of statistical software, that analysts were able to do these calculations. As technology
advanced, different types of software were developed, including open-source software and web-based software.
As you may imagine, all these choices have led to competition and confusion among the analysts,
students, and organizations that use this software. Organizations wonder which statistical packages to
implement. Professors wonder which ones to teach, and students wonder which ones to learn. The purpose
of this chapter is to help you make informed choices about statistical software. We describe the
statistical software available today and offer guidance on the practical choices you face. We
discuss choosing among:
Commercial software, such as SAS and SPSS
Open-source software, such as R and Python
Free software applications, such as G*Power and PS (Power and Sample Size Calculation)
We also provide guidance on how to choose between code-based and non–code-based software, and
end by providing advice on cloud data storage.
Considering the Evolution of Statistical Software
The first widely used commercial statistical software was SAS, and it is still used today.
SAS was originally developed in the 1960s and 1970s to run on mainframe computers. By around 2000,
SAS had been adapted to personal computers (a version known as PC SAS), adding a user-friendly
graphical user interface (GUI). As SAS grew, other commercial statistical packages appeared, the most
popular being IBM’s SPSS. SAS continues to be the go-to program for big data analysis, where
analysts can easily access large datasets stored on servers. In contrast, SPSS, like PC SAS, continues
to be used on personal computers.
If you had taken a college statistics course in 2000, your course would likely have taught
either SAS or SPSS. Your professor would have made one of them available to you for free or for
a nominal license fee through your college bookstore. If you take a college statistics course today, you